perm filename RFNDIR[154,RWF] blob sn#856247 filedate 1988-04-21 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00007 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002			 		EXTERNAL MEMORY
C00005 00003	Discs
C00011 00004	Coordinates
C00015 00005	\rm
C00022 00006	The problem is now in the classic form for a recursive function.  We have a set
C00025 00007	To prepare ordinary English text for typography by TEX, begin by  entering
C00039 ENDMK
C⊗;
		 		EXTERNAL MEMORY
Tapes

The tapes at LOTS have a density of 600 bytes per inch, and a speed of  45
inches per  second;  this  gives  a  maximum  information  transfer  rate,
ignoring start/stop times and gaps, of 27000 bytes/sec.  At 5  bytes/word,
this implies a time of 85  microsec/word; ordinarily, the inner loop of  a
sorting  algorithm  will  be  much  faster  than  this,  so  tape  sorting
algorithms will  be limited  entirely by  tape speeds,  and  sophisticated
algorithms will be  economical.  Information  is stored in  records of  at
most 30,700 bytes per record, followed by a 0.6 inch gap; crossing the gap
takes 13333 microseconds,  adding an  average of  2 microseconds/word  for
data stored  in long  records.  Tapes  are 2400  feet long;  this gives  a
capacity of 44.67 megabytes, or 8,934 million words.

Faster tape drives with higher  density are available; at 6250  bytes/inch
and 125 inches/sec, the speed and capacity parameters are 781250 bytes/sec
= 156250 words/sec  = 6.4  microsec/word.  At  this speed,  even a  simple
sorting algorithm probably can't keep up  with the tape, and a fast  inner
loop becomes very important.

Both types of  tape can read  either forward or  backward, but only  write
forward.  After a write, all data to  the right of the new information  is
lost, so ordinarily one  thinks of writing  only at the  right end of  the
tape.  (Some tapes are not so restricted, e.g. Dectapes.)  Start/stop time
is ______________; it can be avoided if  the program can keep up with  the
tape (?)
Discs

A disc memory can b seen on many levels.  As a mechanical device, it has a
certain set of  parameters (speed, capacity,  access times,  constraints).
When combined with its electronic  controller the parameters are  changed,
never for the better.  When  incorporated into a computer, the  parameters
may change again.  The  adoption of a  file-handling system and  operating
system changes them once more.   We shall look at  the LOTS discs at  each
level.

At LOTS, the discs are of type 3330  (the same as at SCORE).  There are  4
disc drives, each of 19 surfaces,  800 tracks per surface, 20 sectors  per
track, 640 bytes (or 128 words) per  sector.  Rotation time is 1/60 sec  =
16,667 microseconds.  There is  one arm per surface,  which must be  moved
firm track to track.  All  arms move together; at  any one time, all  arms
are on the  same cylinder of  tracks (?)   Seek time for  moving from  one
track to another is  a complicated function of  track number; it is  never
less than ______, or more than ______, and averages 30 ms.

At the hardware level, then, a track  contains 20 x 640 = 12,800 bytes  or
2560 words, which can be read, starting  anywhere on the track, at a  rate
of 60 x 12800 = 768,000 bytes/sec, or 60 x 2560 = 153,600 words/sec.; that
is, 1,302 microsec/byte, 6.510 microsec/word.  This is fast enough to keep
up with most sorting algorithms.

One can  avoid seek  times  by staying  within  one cylinder;  a  cylinder
contains 19  x 12800  = 243,200  bytes  = 48,640  words.  An  entire  disc
contains 800  x  243,200  bytes  = 194.56  Mbytes  =  38.912  Mword.  (For
comparison, the  Palo  Alto  White Pages  contain  about  160,000  entries
averaging about 50 characters, or 8 Mbytes, 1/25 as much.)

At the operating system level, a  programmer normally is sharing the  disc
drive with many other users; his disc requests find the discs and arms  in
almst random positions, except that  central tracks are most likely.   His
requests for certain tracks will be  queued and processed by an  algorithm
which alternates going outward and  inward much as an elevator  alternates
handling all upward  requests with  all downward ones.   At the  operating
system, information is organized into pages of four consecutive sectors  =
2560 bytes  =  512 words;  all  information  transfer is  by  full  pages.
Successive pages of a file are usually, but not always, consecutive on the
disc.  When a track is reached, reading always starts at the first word of
the tracks.

At the  machine/assembly language  level, disc  pages are  an  addressable
virtual memory.  Some pages  are resident in fast  memory, by whim of  the
operating system.   Nonresident  pages  may  be in  a  disc  file,  or  in
``swapping space'' (the most frequently used central tracks) on the  disc.
A program can give  the operating system advance  warning of its need  for
the next page in a  file.  Disc accesses are  optimized for seek time  but
not for rotational time.

At the Pascal level, a Pascal file is mapped into 32 consecutive pages  of
the virtual address space, which is then treated as described above.

Another type of disc unit  is the IBM 3350,  with capacity of 317  Mbytes.
It has ________ cylinders, 30 tracks  per cylinder, 19 K bytes per  track.
A newer  type  is the  3380,  with  2.5 G  bytes  of storage.   It  has  4
independently moving arms, each with its own set of cylinders; in  effect,
it is equivalent to 4 disc drives, each of 625 Mbyte.

The cost of disc storage  is about 10 Kbytes/dollar;  at this cost, it  is
economically feasible to permanently retain  all programs, text, and  ther
data produced by humans, buying more disc drives when old ones fill.
Coordinates

In drawing graphs of functions of one variable, it is customary to use the
horizontal dimension to represent the independent variable, usually called
x, increasing from left to right.  The vertical dimension represents the
dependent variable, usually called y, increasing from bottom to top.  We
will call such coordinate systems graph coordinates.  In graph coordinates,
the points shown below have the coordinate pairs written next to them.
























It is customary to identify array elements, however, by a different convention,
as shown by the example below:

	A11	A12	A13	A14

	A21	A22	A23	A24

	A31	A32	A33	A34

Here the first coordinate, often called i, is vertical and increases from top
to bottom; the second, often called j, is horizontal, and increases from left
to right.  We will call such coordinate systems array coordinates.  Also, in
array coordinates, the coordinates usually refer to the squares of the grid,
rather than the corners.  In an array coordinate system, the shaded squares
shown below have the coordinates shown next to them.
























Comparing the two illustrations, notice that the shift between array and graph
coordinates roughly corresponds to a 90 degree rotation of the image.

In computing, it is customary to use array coordinates in referring to arrays
and to input and output pages.  In this text, we usually use R (for row) as
the vertical coordinate, and C (for column) as the horizontal on the printed
page.  Array coordinates correspond to the order in which most computer
printers print the characters on the page, with lines from top to bottom 
and characters left to right within the line.  A procedure call to put an
asterisk as character 9 of line 5 of a page will normally be P('*',9,5).  
Your programs will usually be most intelligible to other readers if you
follow this convention.  Your typical page printing operation will be:

	FOR R:=1 TO 60 DO
		BEGIN
		FOR C:=1 TO 132 DO
			WRITE(IMAGE[R,C]);
		WRITELN
		END

\rm
\font\sevenrm=cmr7
\footnote{}{\sevenrm\noindent integs.tex[1,rfn]}
\centerline{Adaptive Integration: a Study in Recursion}
\vskip .125in
\centerline{R.\ W.\ Floyd}
\centerline{$\copyright$ 1983}

Suppose I want to evaluate $\int↑1↓0e↑{-x↑2}dx$. I look in a table of integrals;
it isn't there.  I ask a mathematician; he says there is no formula for
$\int e↑{-x↑2}dx$.  (More precisely, no formula using the elementary functions;
the integral can be expressed in terms of the {\sl error function},
$\hbox{erf}(x)$, for
which there are tables.  Let's pretend someone has checked out the library's
only copy.)  I will have to approach the problem computationally.  One method
is to approximate the function by narrow trapezoids or rectangles 
\vskip 2in
\noindent
as shown in the figure (in practice, of course, we would use narrower ones).
That is, I can break up $\int↑b↓af(x)dx$ into $\int↑{x↓1}↓{x↓0}f(x)dx
+\int↑{x↓2}↓{x↓1}f(x)dx+\cdots+\int↑{x↓n}↓{x↓{n-1}}f(x)dx$, where
$x↓0=a$, $x↓n=b$, and $x↓0$, $x↓1$, $x↓2,\cdots ,x↓n$ are closely spaced. 
Then I can approximate $\int↑{x↓{i+1}}↓{x↓{i}}f(x)dx$ 
by $0.5(f(x↓i)+f(x↓{i+1}))(x↓{i+1}-x↓i)$, 
or by $f(x↓i)(x↓{i+1}-x↓i)$, or by $f(0.5(x↓i+x↓{i+1}))(x↓{i+1}-x↓i)$, or
something similar.

These methods of numerical integration are in regular use; there are some
analogous methods that tend to be more accurate than the ones above, but the
principle is the same.

We can't be entirely sure, though, that such a method gives an accurate answer. 
If I try to evaluate $\int↑1↓0\cos (200πx)dx$ by evaluating $\cos (200πx)$ at
$x=0,0.01$, $0.02,\cdots ,0.99,$ $1.00$, and then using a formula like the ones
above, I find the integrand is 1.0 at each value of $x$ I try, and the estimate
I come up with will be $\int↑1↓0\cos (200πx)dx=1.0$; the true value, though,
is zero.  By dividing the interval into smaller sections, I could get a more
accurate answer--but how can I be sure it's accurate enough?
\vfill\eject

I have an idea. If I can get one estimate that I know is too large, and another
that I know is too small, and if they differ by less than 0.01, then either one
(or their average) is an accurate enough answer.  For an estimate I know is too
small, I use the trapezoid shown below:\par\vskip 3.5in\par
Since $e↑{-x↑2}$ has a negative second derivative in (0,1), it lies above the
straight line through the endpoints.  
I'll call the area of the trapezoid
$T(0,1)$; $T(0,1)=0.5(f(0)+f(1))=0.5(1.0000+0.3679)=0.6839$.
To get an estimate that I know is too large, I find a trapezoid tangent to
$e↑{-x↑2}$ at the midpoint, as shown below:\par\vskip 3.5in
\vfill\eject

Because the second derivative is negative, the function stays below the straight
line.  I'll call the area $M(0,1)$; it is equal to $f(1/2)=0.7788$.
Notice that it does not depend on the slope of the tangent.
Unfortunately, $0.7788-0.6839=0.0948$, which is much too big. But I can see that
if I subdivide the interval, and use the same technique on each of the smaller
intervals, I may get a more accurate result:\par\vskip 3.5in\par\noindent

Naturally, when I subdivide, I can't allow the error of each half to be as
much as 0.01, but if I require that each half be within 0.005 of the correct
value, the sum will be correct within 0.01.

What if one of the subdivisions still doesn't give me an accurate enough result?
I can cut it in half again, and again; in each region I can subdivide as much
as I need to get an accurate result.  I will probably need to divide smaller
near $x=0$, where the second derivative is large, than near $x=1$, where it
is small.

The diagram below sums up the whole computation.  Each box has a value of $A$,
$C$, and $L$; its job is to evaluate $\int↑C↓Af(x)dx$ with an error no bigger than
$L$.  To do that, it calculates an 
estimate\par\noindent\hbox{$T=0.5(f(A)+f(C))(C-A)$}, which is too
small, and (letting $B=0.5(A+C)$) an estimate $M=f(B)(C-A)$ which is too large.
If $T-M<L$, it calculates $Area$ $=0.5(T+M)$; otherwise, it breaks the interval
into two subintervals, estimates the area of each within a tolerance $L/2$, and
adds the results.
\par\vfill\eject
The problem is now in the classic form for a recursive function.  We have a set
of parameters to define the particular problem; here, they are $A$, $C$, and $L$.
We have a direct way to solve the problem if it is ``small'' enough; here,
if $C-A$ is small enough $T-M$ will be smaller than $L$, 
and we can use ${{T+M}\over 2}$
as the answer.  Otherwise, we subdivide the problem into several smaller ones
of the same kind, each with a new set of parameters, and the function calls
itself to solve the smaller problems.  In Pascal, a recursive function for this
problem would look like this:

\vskip .125in

{\obeylines
\hskip .25in {\tt FUNCTION AREA (A,C,L:REAL):REAL;
\hskip .25in VAR B,D,T,M:REAL;
\hskip .5in BEGIN
\hskip .5in B:=0.5$\ast$(A+C);
\hskip .5in D:=C-A;
\hskip .5in T:=0.5$\ast$(f(A)+f(C))$\ast$D;
\hskip .5in M:=f(B)$\ast$D;
\hskip .5in IF T-M$<$L THEN AREA:=0.5*(T+M)
\hskip .5in ELSE AREA:=AREA(A,B,L/2)+AREA(B,C,L/2)
\hskip .5in END}}

\vskip .125in
\par\noindent
and the main program could call it, for the particular problem we are solving,
with {\tt WRITE(AREA(0,1,0.01))}.

{\sl Exercise}: What would go wrong if we tried to get $\int↑2↓1e↑{-x↑2}dx$
by calling {\tt AREA(1,2, 0.01)}?

{\sl Exercise}: Modify the {\tt AREA} procedure so it also works if the second 
derivative of $f$ is negative throughout $(A,C)$.  How would you 
use the modified function if you want to get $\int↑5↓0e↑{-x↑2}dx$?

\end
To prepare ordinary English text for typography by TEX, begin by  entering
the text into a computer file, with a  name of the form f.TEX, where f  is
any name you choose.  Use freely all printing characters except ${ }# \  ⊗
↑ ↓  %  and quotation  marks,  all  of which  require  special  treatment.
Separate paragraphs  with  a  blank line.   Single  spaces  suffice  after
periods, etc.; TEX will provide the correct spacing automatically.   There
are sections of this introduction to  deal with special problems that  may
come up; where possible, these sections are self-contained.  The  sections
are:

Punctuation
Changing Fonts
Spacing
Lines and Boxes
Mathematical Formulas
Indentation
Titles and Headings
Tables
Page Breaks
Footnotes
Scaling
Errors

If you are willing to use the same standard page size, type font, margins,
line spacing, paragraph form, etc., used in this  document, you may say so
by putting a line at the beginning of your file that says:

\input basic

If not, see the appropriate sections of this document; all the above  must
be specified to use  TEX.  At the end of your file,  put a line that says:

\par\vfill\end

When your file is complete,  you may process it with  TEX for output on  a
Dover or an XGP printer, or on a Datadisc screen.

For output on the Dover printer, type the monitor command

r tex [return]

When the tex program types an asterisk, type 

\input f [return]

where f.tex is the name of your file.

If your file  contains no  TEX errors or  omissions, TEX  will prepare  an
output file  called f.dov  for printing  on the  Dover printer,  and  will
display on the screen a monitor command like

.dover fff.press

to print that file.  If you want the printing done, hit [return] and  wait
for your Dover output.  If not, hit [clear] and do something else.

For output on the XGP printer, give the monitor command

r xgptex [return]

When the tex program types an asterisk, type

\input f [return]

where f.tex is the name of your file.  If your file contains no TEX errors
or omissions, TEX  will prepare  an output file  for printing  on the  XGP
printer, and will display on the screen a monitor command like

.r xgpsyn; f.xgp

If your file contains TEX errors,  you will see an error message,  perhaps
like....  See the section on Errors for help.

Punctuation

Periods, commas, semicolons, colons,  question marks, exclamation  points,
parentheses ( ),  brackets [ ],  and hyphens are  all typed normally.   If
followed by a space, TEX will  print them with the usually correct  amount
of space, except that a period in  an abbreviation may be followed by  too
much space.  If so, use \_ rather than _ after the period.

Certain symbols have a special meaning  in TEX; to enter them as  ordinary
text, they must  be preceded by  \.  Thus to  get the symbol  in the  left
column below, type the sequence in the right column:

	\	\\
	$	\$
	{	\{
	}	\}
	%	\%
	#	\#

To get a single hyphen (e.g. ``twenty-one''),  just type it in.  To get  a
short dash, type two consecutive hyphens.  To get a long dash, type  three
consecutive hyphens.  To get a minus sign, or any other mathematical sign,
see section  Mathematical Formulas.   (If you  actually want  two or  more
consecutive hyphens, see the TEX manual, section ?).  To hyphenate  breaks
in long words see the TEX manual, Chapter 14.

To get single left or right  quote marks  (` '), use the keyboard  symbols
` and '  respectively.   To  get  double  quotes  (``  '')  type a pair of  
single quotes of the correct slant, not the ditto mark ".  To get adjacent 
nested quote marks (`` `),  see the  TEX  manual, Chapter    .   To get an
ellipsis (like one, two,  three,...)   as shown  in  the left column, type
one of the following

$\ldots$
$\ldotss$
$\ldotsm$

where the first gives  three dots with minimum  spacing before and  after;
the second is followed by extra space; the third is preceded and  followed
by extra space.

Accents

To get the character in  the left column below,  type the sequence in  the
right column.  If the accent  you want is not  shown, see the TEX  manual,
Chapter 9 and page 123.

Most of the accents shown on the  letter o can be used with other  letters
and symbols.  The right hand table gives some exceptions.

	o	\'o		a	\a_a
				A	See TEX manual, Ch. 9.
	o	\`o		l	\l_l
				L	See TEX manual, Ch. 9.
	o	\=o		c	\c_c
				C	See TEX manual, Ch. 9.
	o	\"o		i	\'\i_
				j	\'\j_
	o	\A_o		i	\=\i_
				i	\b_i_
	o	\∨_o		j	\b_\j_

	o	\u_o

	o	\H_o

	o	\b_o

	o	\s_o

	oo	\t_oo

	a	\ae_

	A	\AE_

	o	\oe_

	O	\OE_

	o	\o_

	O	\O_

		\ss_

Examples:
	'~	(Spanish
	c, O	(French)
	l	(Polish)
	yu	(Russian)
	o, B	(German)
	a, o	(Danish)

	Ha lich (German: ugly): H\"a\ss lich
	
	canon subterraneo (Spanish: underground canyon): ca\s non
	   subterr\,aneo

	mnozenje (Serbocroatian: multiplication): mno\u zenje

	arboker (Norwegian: annals): \a arb\o ker

Measures

Distances and lengths  in TEX  can be  specified in  several units;  these
include inches, millimeters, and points.  To specify a length exactly, for
example the width of the printed region of the page, enter the length as a
number with optional sign and decimal point, followed by the  abbreviation
for a unit of  measure: in (inches), mm  (millimeters), pt (points;  about
1/72 in; capital letters in the standard fonts are 10pt high), or ex  (the
height of a capital in the  current font; this variable measure is  useful
for dimensions like interline spacing that usually change when you  change
the font size.

Often it is most satisfactory to  allow some play in measurements,  giving
TEX some freedom to shift symbols  around.  To allow play, enter a  length
as |1| plus |2| minus |3|, where  |1| is the preferred length, |2| is  the
maximum desirable stretch,  and |3|  is the  maximum permitted  shrinkage.
The plus |2| or the minus |3| can be omitted if they are zero.

Examples: \vskip 0.25in leaves 1/4 inch  of extra vertical space where  it
appears.  To right-justify  a title,  say ``TEX,'' in  a line  of a  given
length, one  might specify  the length  and then  enter {\hskip  0pt  plus
10000pt TEX}; the  allowed stretch of  the horizontal skip  will fill  the
line.

Spacing

When you begin a  TEX file with \input  basic, standard spacing  measures,
the same as those used  in this document, are  read in.  You may  override
any of them  at any place  in your file.   If the override  is inside  TEX
brackets {  }, it  will only  apply up  to the  closing bracket.   In  the
patterns given below, | | is a measure, as explained in Measures; examples
are 10pt_ (the height of a capital letter) and 0.25in_ (1/4 inch).

To indent the first line of each  paragraph by | |, enter \parindent |  |.
To make the first  line of each  paragraph hang out on  the left, as  with
numbered paragraphs, use a negative measure.  Standard is 20pt.

To get a  page layout in  which the printed  region has width  | |,  enter
\hsize | |.  Standard is 324 points = 4.5 inches.

For a printed region  of height |  |, enter \vsize |  |.  Standard is  504
points = 7 inches.  On the resulting output page, the printed region  will
be horizontally centered.

To set the normal top margin so that the base line (the imaginary line  on
which most letters rest) is down |  | from the top of the printed  region,
enter \topbaseline | |.  Standard is 10pt.

To set the normal distance between successive lines of text at | |,  where
the distance is measured between baselines  and so includes the height  of
the letters on the lower line, enter \baselineskip | |. Standard is 12pt.

To set the normal extra distance between  lines at the end of a  paragraph
to | |, enter \parskip | |.  Standard is 0pt, with 1pt of stretch.

To set  the  separation  of  oversized  lines  (e.g.  lines  that  contain
displayed formulas like     to | |, enter \lineskip | |.  Standard is 1pt.

To set  the spacing  above and  below displayed  equations to  | |,  enter
\dispskip | |.  Standard is 12pt, with 3pt of stretch and 3pt of shrink.

Lines

To make a horizontal line of length  |1| and of this thickness (0.4pt)  at
baseline level, enter \hrule width |1|.  For a different thickness, follow
the entry  by  height  |2|,  where  |2| is  the  measure  of  the  desired
thickness.   To make a vertical line of length |1|,  and of this thickness 
(0.4pt), enter \vrule height |1| depth 0 For a different thickness, follow
the entry  by width  |2| where  |2| is  the desired  thickness.  For  more
details, see TEX manual, Ch. 21.